Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support specification of multiple scaling strategies #566

Merged
merged 1 commit into from
Oct 5, 2023

Conversation

EddyMM
Copy link
Contributor

@EddyMM EddyMM commented Oct 4, 2023

Description of changes

  • There is need to support more than 2 scaling strategies (i.e. all-or-nothing and best-effort).
  • These changes refactor the code base to accommodate more than 2 scaling strategies.
  • This has been done by using an Enum that defines/enumerates the possible strategies
  • The mapping between the Enum values and existing all_or_nothing_batch configuration parameter has been done as follows:
    • all_or_nothing_batch=False maps to ScalingStrategy.BEST_EFFORT
    • all_or_nothing_batch=True maps to ScalingStrategy.ALL_OR_NOTHING
    • all_or_nothing_batch missing/undefined maps to ScalingStrategy.ALL_OR_NOTHING
  • The existing unit tests have been updated to use this Enum.

Tests

  • Updated existing unit tests and added a unit test to check the behaviour of the ScalingStrategy Enum based on various values used to initialise it
  • Submitted jobs matching the ones done here #564 and confirmed that we get the same output

References

  • N/A

Checklist

  • Make sure you are pointing to the right branch.
  • If you're creating a patch for a branch other than develop add the branch name as prefix in the PR title (e.g. [release-3.6]).
  • Check all commits' messages are clear, describing what and why vs how.
  • Make sure to have added unit tests or integration tests to cover the new/modified code.
  • Check if documentation is impacted by this change.

Please review the guidelines for contributing and Pull Request Instructions.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov
Copy link

codecov bot commented Oct 4, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (ea58da0) 89.92% compared to head (5852999) 89.98%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #566      +/-   ##
===========================================
+ Coverage    89.92%   89.98%   +0.05%     
===========================================
  Files           16       16              
  Lines         2689     2705      +16     
===========================================
+ Hits          2418     2434      +16     
  Misses         271      271              
Flag Coverage Δ
unittests 89.98% <100.00%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
src/slurm_plugin/common.py 82.08% <100.00%> (+4.31%) ⬆️
src/slurm_plugin/instance_manager.py 100.00% <100.00%> (ø)
src/slurm_plugin/resume.py 80.99% <100.00%> (+0.15%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

src/slurm_plugin/common.py Fixed Show fixed Hide fixed
@EddyMM EddyMM force-pushed the wip/support-multi-scaling-strategies branch from b995f81 to 05e8a0a Compare October 4, 2023 12:10
@EddyMM EddyMM changed the title Wip/support multi scaling strategies Support specification of multiple scaling strategies Oct 4, 2023
- Use a more extensible configuration parameter for the scaling strategy.
- Default to all-or-nothing in case of missing or invalid scaling strategy.

Signed-off-by: Eddy Mwiti <[email protected]>
@EddyMM EddyMM force-pushed the wip/support-multi-scaling-strategies branch from f3197cc to 5852999 Compare October 4, 2023 13:08
@EddyMM EddyMM marked this pull request as ready for review October 4, 2023 13:18
@EddyMM EddyMM requested review from a team as code owners October 4, 2023 13:18
Comment on lines +1014 to +1017
# At instance launch level, the various scaling strategies can be grouped based on the actual
# launch behaviour i.e. all-or-nothing or best-effort
all_or_nothing_batch = scaling_strategy in [ScalingStrategy.ALL_OR_NOTHING]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this comment.

all_or_nothing_batch is actually a boolean that is passed downstream to FleetManagerFactory as all_or_nothing ..so maybe it would be better to rename this parameter accordingly since is not clear if/what the _batch adds

While, regards doing scaling_strategy in [ScalingStrategy.ALL_OR_NOTHING]
instead of scaling_strategy == ScalingStrategy.ALL_OR_NOTHING:

as done a few lines above, I assume that this is because we are planning to add another strategy that falls in the "all or nothing" category.

Is this the meaning of the comment above ?

Copy link
Contributor Author

@EddyMM EddyMM Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all_or_nothing_batch the _batch alludes to the way we chunk the launch attempts in "batches" of a certain size/count of instances. So all_or_nothing_batch implies that each batch will be done in an all-or-nothing fashion.

as done a few lines above, I assume that this is because we are planning to add another strategy that falls in the "all or nothing" category.

Exactly, it was done with this in mind. That expression will be reused in the upcoming PR.

Comment on lines +1236 to +1239
# At fleet management level, the scaling strategies can be grouped based on the actual
# launch behaviour i.e. all-or-nothing or best-effort
all_or_nothing_batch = scaling_strategy in [ScalingStrategy.ALL_OR_NOTHING]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants